Enriching the ISST-TANL Corpus with Semantic Frames
نویسندگان
چکیده
The paper describes the design and the results of a manual annotation methodology devoted to enrich the ISST–TANL Corpus with Semantic Frames information. The main issues encountered in applying the English FrameNet annotation criteria to a corpus of Italian language are discussed together with the choice of anchoring the semantic annotation layer to the underlying dependency syntactic structure. We also describe an experiment to measure inter-annotator agreement and a first case study to extend and specialise FrameNet annotation to a corpus of legislative texts.
منابع مشابه
Comparing the Influence of Different Treebank Annotations on Dependency Parsing
As the interest of the NLP community grows to develop several treebanks also for languages other than English, we observe efforts towards evaluating the impact of different annotation strategies used to represent particular languages or with reference to particular tasks. This paper contributes to the debate on the influence of resources used for the training and development on the performance ...
متن کاملA Supervised Approach for Enriching the Relational Structure of Frame Semantics in FrameNet
Frame semantics is a theory of linguistic meanings, and is considered to be a useful framework for shallow semantic analysis of natural language. FrameNet, which is based on frame semantics, is a popular lexical semantic resource. In addition to providing a set of core semantic frames and their frame elements, FrameNet also provides relations between those frames (hence providing a network of f...
متن کاملThe Tanl tagger for Named Entity Recognition on Transcribed Broadcast News
The Tanl tagger is a configurable tagger based on a Maximum Entropy classifier, which uses dynamic programming to select the best sequences of tags. We applied it to the NER tagging task, customizing the set of features to use, and including features deriving from dictionaries extracted from the training corpus. The final accuracy of the tagger is further improved by applying simple heuristic r...
متن کاملEnriching the Output of a Parser Using Memory-based Learning
We describe a method for enriching the output of a parser with information available in a corpus. The method is based on graph rewriting using memorybased learning, applied to dependency structures. This general framework allows us to accurately recover both grammatical and semantic information as well as non-local dependencies. It also facilitates dependency-based evaluation of phrase structur...
متن کاملThe Tanl Lemmatizer Enriched with a Sequence of Cascading Filters
We have extended an existing lemmatizer, which relies on a lexicon of about 1.2 millions form, where lemmas are indexed by rich PoS tags, with a sequence of cascading filters, each one in charge of dealing with specific issues related to out-of-dictionary words. The last two filters are devoted to resolve semantic ambiguities between words of the same syntactic category, by querying external re...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2012